Abstract



This paper discusses the problems and possibilities of collecting bee dance data in a linguistic corpus and of using linguistic instruments such as Zipf's law and entropy statistics to decide whether the dance carries information of any kind. We describe this against the historical background of attempts to analyse nonhuman communication systems.



1 Introduction 



The idea for this paper originated from a small paper that my daughter wrote for her bachelor's study in biology at the University of Wageningen, Netherlands. In this paper she discussed the views on the function of the so-called 'honey bee dance', more in particular the opposition of A. Wenner [18][17] and others against the established theory first put forward by K. von Frisch [16] (non vidi) that the shape and direction of a dance performed by a bee communicate to other bees the direction and distance of a food source (for a recent discussion of the established views see [?]).

As a computational linguist I am not qualified to judge between the two views. Of course the 'bee dance' is often referred to in linguistic textbooks as an example of non-human communication, although not many linguists would go so far as to call it 'speech' or even 'language'.

However, I wondered if the techniques that are used in the field of corpus linguistics could be applied to the data that were collected by the entomologists in studying the bee dance. More in particular: if it could be demonstrated that the data in this corpus had certain features in common with linguistic corpora, this could indicate that the bees communicated something. Whether this 'something' concerned the location of a particularly succulent brand of honey or just local hive gossip was (to me) a matter of no concern.

It rapidly became clear that ample literature already existed on the subject of animal communication. Indeed, the first attempt to statistically analyse bee dance data in the light of Shannon's information theory was made by Haldane and Spurway in 1954 [3], who used the original data of von Frisch. Also, the theories of Zipf, notably Zipf's laws and the principle of least effort, have played an important role in the analysis of both human language and animal communication systems and even in the study of manuscripts of unknown origin such as the Voynich manuscript¹ [?].

Figure 1: Honey bee dance (from [2])


Often the debate on whether to call a certain communication system a 'language' is based on different definitions of 'language' or even on the philosophical leanings of the participants in the debate. This is true even of the Wenner attacks on the theory of von Frisch. As Kak [6] remarks: "It appears that the controversy is partly of a semantic nature. What does language mean? (...) Operationally this means that a language must be associated with a vocabulary of basic signs and sounds and a grammar that allows the signs or sounds to be combined into an unlimited number of statements" (my emphasis, P.). This key notion of language having an unlimited number of elements returns in many papers (e.g. Ujhelyi [15]) and is also true for the words in human language, as was mathematically proven by Kornai [7]. Ujhelyi also points out that the main difference between human and animal communication is that animal communication only allows for a limited set of messages, which are in general genetically fixed. However, the work of McCowan and her collaborators [12][11] at least proves that enough variation exists in the communication of humans, dolphins and squirrel monkeys to observe Zipf's law at work.

I started this research in the hope that enough data could be obtained from bee 
dance data to also observe Zipf 's law in action, but it turned out that this kind of 
data was not at all suitable for this kind of analysis. 

Before data can be analysed, they must be collected and stored in a suitable format. After explaining the principles of the honey bee dance and the mathematical principles underlying the Zipfian laws and entropy, we will proceed to the problems of collecting the data and of analysing or translating them into a format that is suitable for comparing the patterns with those of other (human) languages.



2 Animal communication 

2.1 The bee dance 

In 1947 it was observed by K. von Frisch [16] that there was a certain pattern to the movements that a bee makes on the comb (see figure 1). This pattern describes approximately a figure 8, and on the traverse the bee waggles its abdomen as if for emphasis. Von Frisch observed that the angle of this traverse with the vertical indicated the direction of a honey source (more precisely: the angle with the vertical correlated with the angle between the honey source and the sun). The duration of the dance, then, would indicate the distance.

Von Frisch and his followers also noted that other bees indeed seemed to observe the dance closely and afterwards acted upon it; they therefore assumed that the dance had a communicative function, and the bee dance theory was born. Because of the exciting nature of this discovery, many authors followed up and meticulously analysed every possible aspect of the dance. We already mentioned the controversy generated by Wenner and Wells and the seminal work of Haldane and Spurway, but we could add dozens of scientists. Haldane and Spurway computed the correct amount of information in a message for non-discrete values such as direction, coming to approx. 5 cybernetic units (sic!) as to direction, 4 to 5 as to distance and 2 to 3 as to the number of workers needed. This totals about 12 bits, equivalent to a human language of 4000 phrases (signifiants with corresponding signifiés), needing less than a hundred words by human or English standards. Put differently, a code of all possible combinations of only three characters would cover the communication system of the honey bee dance. Towne and Gould [14] were among the many scientists who continued research into the spatial precision of this communication, giving much attention to the mathematics of communication and survival in circumstances where the scatter and quantity of the flowers varied. I was mildly amazed that I could not find the observation radius of the recruit in flight as a factor in the discussion.
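
The arithmetic behind the figures attributed to Haldane and Spurway can be checked with a short sketch. It assumes the midpoints of the quoted ranges (4.5 and 2.5 bits), and for the three-character code it searches for a hypothetical alphabet size, which the text above does not state.

```python
# Midpoints of the quoted ranges are an assumption made for illustration only.
direction_bits = 5.0
distance_bits = 4.5   # "4 to 5"
workers_bits = 2.5    # "2 to 3"

total_bits = direction_bits + distance_bits + workers_bits
distinct_messages = 2 ** int(total_bits)

# Smallest alphabet for which all three-character strings cover every message.
alphabet_size = 1
while alphabet_size ** 3 < distinct_messages:
    alphabet_size += 1

print(total_bits, distinct_messages, alphabet_size)
```

Under these assumptions, 12 bits correspond to 4096 distinct messages (the "4000 phrases" above), and three characters suffice once the alphabet has 16 symbols.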

A typical database for the processing of bee dance data might look like table 1. Here, every observation includes three estimates of the angle with the vertical direction of the central line of the '8'. From this angle and the height of the sun at that moment, the direction is computed. For that reason, the azimuth and time-of-day are also included in the table. The distance of the honey source is deduced from the duration of the dance; for this purpose the number of dances, the total time and the average time are noted in the table. Finally the data are translated to an X and Y value for subsequent plotting.
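
As a rough illustration of the record described above, the following sketch stores one observation and converts it to plotting coordinates. All field names are hypothetical, the duration-to-distance factor is an invented placeholder (no calibration is given here), and the bearing is taken as sun azimuth plus dance angle, following the rule of von Frisch described in this section.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class DanceObservation:
    angle_estimates_deg: tuple   # three estimates of the angle with the vertical
    sun_azimuth_deg: float       # azimuth of the sun at the time of observation
    time_of_day: str             # kept for reference, e.g. "11:30"
    n_dances: int                # number of dances observed
    total_time_s: float          # total duration of those dances

    @property
    def mean_angle_deg(self):
        # Plain arithmetic mean; adequate only when estimates do not straddle 0/360.
        return mean(self.angle_estimates_deg)

    @property
    def mean_duration_s(self):
        return self.total_time_s / self.n_dances


# Hypothetical calibration: metres of distance per second of dance duration.
METRES_PER_SECOND = 1000.0


def to_xy(obs, metres_per_second=METRES_PER_SECOND):
    """Return (x, y) in metres, with y pointing north and x pointing east."""
    bearing = (obs.sun_azimuth_deg + obs.mean_angle_deg) % 360
    distance = obs.mean_duration_s * metres_per_second
    rad = math.radians(bearing)
    return distance * math.sin(rad), distance * math.cos(rad)


example = DanceObservation((30, 30, 30), 60.0, "11:30", 4, 8.0)
x, y = to_xy(example)
```

A real data set would replace the linear factor by an empirically fitted calibration curve.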

2.2 Other examples 

It is tempting to consider the human communication system as a descendant of evolutionarily earlier systems such as, indeed, the bee language, but of course this is not necessarily true. On the contrary, it seems that the communication of humans and higher mammals is based on sound and ambiguity [12]. The direct predecessor of human language should be found in territoriality messages and monogamous duetting [15]. As we will see below, it conforms to Zipfian laws and is firmly rooted in mathematics. Other animals communicate over a variety of channels, including sound but also movement and smell, as our honey bees do. Haldane and Spurway put forward the notion that the honey bee dance is a highly ritualized intention movement.



¹ The Voynich manuscript is a 16th century manuscript written in an unknown language and alphabet, but probably a hoax.

Table 1: A series of observations of the bee dance (courtesy M. Beekman).



Phylogenetically quite near the honey bee we find other communication systems, e.g. that of ants. Reznikova [13] notes that the duration of the contact between scout and recruits is linearly correlated to the number of the traverse where food was found, and suggests that this is an indication that ants can count. In our opinion this resembles a playback-like report more than an indication of counting discrete units.

Just the observation of an act of the scouting bee and the subsequent reaction of the recruits is not enough to call the behaviour 'language' or 'communication', even if the reaction of the recruits makes sense in the context. To give an example: imagine a student entering his dormitory, smelling of beer and staggering around in circles before collapsing in a chair. His friends would observe this behaviour, come to the conclusion that at least one pub in the vicinity had opened its doors and walk out to see if they can find a place where music and light indicate the presence of an open pub. I would hesitate to call the action of the first student 'communication', and certainly not that particular communication that is called language. If different pubs sold different beers that caused different reactions in the 'scout student', so that his friends would not only recognize the fact that a pub had opened, but also which pub had opened, this still would not be called 'language', because language presupposes intentionality [5]. But substituting the problem of the meaning of 'intentionality' for that of the meaning of 'language' does not really help; what we need is a model where 'language' is defined and embedded within the broader term of communication. As we will see below, thanks to the work of Ferrer and Sole, this is possible within the general framework of Zipf's laws and the principle of least effort.



3 Least effort and mathematics 



Zipf's law is the observation that the frequency of occurrence $P_i$ of some event, as a function of its rank $i$ (where the rank is determined by that same frequency of occurrence), is a power-law function $P_i \sim 1/i^{\alpha}$ with the exponent $\alpha$ close to unity. This is true for interesting phenomena such as the frequencies of words in human languages, and for the size of the population of cities or the division of wealth. Later, other researchers such as Wentian Li [10] have proven that Zipf's law also holds for less interesting phenomena, such as randomly generated sequences of characters. Research like that of Cohen, Mantegna and Havlin [1] tried to find the differences between such 'random languages' (my terminology²) and real languages. In the paper mentioned here, it was found that the values of the Zipf-derived measures, and especially the entropy of the word frequencies, differed after all between the natural language texts and the artificial texts. Landini [?] also used Zipf's law when he looked for meaning in the Voynich manuscript. The consensus of all researchers is that the emergence of a Zipf relation between phenomena in a language-like construct does not prove that the item under consideration is a language, but that a real language almost certainly displays Zipf behaviour.

² The original term in the paper is 'artificial', but this causes problems with the terms used for computer languages and suchlike.

Figure 2: Different efforts for speaker and listener
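
The rank-frequency relation and the entropy of the frequencies mentioned in this section can be computed for any list of tokens. The sketch below is a minimal illustration, not the procedure of any of the cited papers: it estimates the exponent $\alpha$ by an ordinary least-squares fit on the log-log rank-frequency data (a simple but crude estimator), and the function names are my own.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Relative frequencies of the token types, ordered from most to least frequent."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return [count / total for _, count in counts.most_common()]


def zipf_exponent(freqs):
    """Least-squares estimate of alpha in P_i ~ 1 / i**alpha (needs at least 2 types)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def entropy_bits(freqs):
    """Shannon entropy of the frequency distribution, in bits."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)


# Toy corpus whose type counts are exactly proportional to 1/rank.
toy_tokens = ["a"] * 60 + ["b"] * 30 + ["c"] * 20 + ["d"] * 15 + ["e"] * 12
toy_alpha = zipf_exponent(rank_frequency(toy_tokens))
```

For the toy corpus the fitted exponent is 1 up to rounding error; real corpora will only approximate a straight line on the log-log plot.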

Ferrer and Sole [4] describe a double law for Zipf, i.e. the fact that the Zipfian curve is best described by two or even more functions. This suggests the existence of two 'regimes' in English: one general lexicon (5000 for the BNC) and a specialized lexicon.

In a later paper [5], Ferrer and Sole formulate an attractive model that establishes Zipf's law as a necessary stage in the relation between signifiant and signifié that lies at the basis of all communication. They model signals and objects in a simple binary matrix $a$ of $n$ signals and $m$ objects (Saussure's signifiant and signifié). If a signal refers to an object, the corresponding cell is one, else it is zero.

In figure 2 there are two examples of such signal-object matrices. The English speaker invests no effort when he wants to refer to the piece of furniture, the financial institution or the river bank; the word 'bank' covers all three. The listener, however, has to work very hard to extract the true meaning from context. The Dutch situation is the opposite: the speaker has to select one from three words, whereas the listener knows immediately what is referred to.
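
The two situations can be written down as binary signal-object matrices. In the sketch below the rows are signals and the columns are objects; the Dutch row labels are placeholders, since the actual words of figure 2 are not reproduced here, and the two helper functions are only a crude stand-in for 'effort'.

```python
# Columns: the three objects of the example.
objects = ["furniture", "financial institution", "river bank"]

# One English signal covers all three objects.
english = {"bank": [1, 1, 1]}

# Three distinct Dutch signals, one per object (placeholder labels).
dutch = {
    "word_1": [1, 0, 0],
    "word_2": [0, 1, 0],
    "word_3": [0, 0, 1],
}


def speaker_choices(lexicon):
    """Number of signals the speaker has to choose from (number of rows)."""
    return len(lexicon)


def listener_ambiguity(lexicon):
    """For each signal, the number of objects it may refer to (row sums)."""
    return {signal: sum(row) for signal, row in lexicon.items()}


print(speaker_choices(english), listener_ambiguity(english))
print(speaker_choices(dutch), listener_ambiguity(dutch))
```

A row sum of 3 is the listener's burden in the English case; three rows with sum 1 shift the burden to the speaker in the Dutch case.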

Ferrer and Sole vary the zeros and ones in the matrix using an evolutionary algorithm. They then use the signal entropy to compute the minimum cost of the combined effort for both parties in such matrices:

$$\Omega(\lambda) = \lambda H_m(R|S) + (1 - \lambda) H_n(S)$$

where the parameter $\lambda$ weighs the contribution of each party. The following equation describes the mutual information for different values of $\lambda$:

$$I_n(S,R) = H_n(S) - H_n(S|R)$$

and $L$ describes the effective lexicon size in relation to the number of signals:

$$L = \frac{|\{i \mid \mu_i > 0\}|}{n}$$

where $\mu_i = \sum_j a_{ij}$ in matrix $a$.
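
To make these quantities concrete, the sketch below evaluates the cost, the mutual information and the effective lexicon size for a given binary matrix. The probability model is an assumption not spelled out in the text above: every object is taken to be equally likely, and an object with several signals uses each of them with equal probability. The logarithm bases ($m$ for the listener's entropy, $n$ for the signal entropies) are likewise an assumption.

```python
import math


def entropy(probs, base):
    """Shannon entropy with logarithm base `base`; defined as 0.0 for a base below 2."""
    if base < 2:
        return 0.0
    return -sum(p * math.log(p, base) for p in probs if p > 0)


def measures(a, lam):
    """Cost, mutual information and lexicon size for binary matrix `a` (n signals x m objects)."""
    n, m = len(a), len(a[0])
    omega = [sum(a[i][j] for i in range(n)) for j in range(m)]  # signals per object
    if min(omega) == 0:
        raise ValueError("every object needs at least one signal")
    mu = [sum(row) for row in a]                                # objects per signal

    # Assumed model: p(r_j) = 1/m and p(s_i | r_j) = a_ij / omega_j.
    joint = [[a[i][j] / (m * omega[j]) for j in range(m)] for i in range(n)]
    p_s = [sum(row) for row in joint]

    h_s = entropy(p_s, n)
    h_r_given_s = sum(
        p_s[i] * entropy([joint[i][j] / p_s[i] for j in range(m)], m)
        for i in range(n) if p_s[i] > 0
    )
    h_s_given_r = sum(
        (1 / m) * entropy([a[i][j] / omega[j] for i in range(n)], n)
        for j in range(m)
    )
    return {
        "cost": lam * h_r_given_s + (1 - lam) * h_s,
        "mutual_information": h_s - h_s_given_r,
        "lexicon_size": sum(1 for x in mu if x > 0) / n,
    }


one_to_one = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
one_word = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]
result_one_to_one = measures(one_to_one, 0.5)
result_one_word = measures(one_word, 0.5)
```

The one-to-one matrix gives maximal mutual information and a full lexicon, while the single-word matrix gives zero mutual information and a lexicon of one third; at $\lambda = 0.5$ both happen to have the same cost under this model.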

If $I_n(S,R)$ and $L$ are plotted against $\lambda$ we get the graphs in figure 3. It is immediately clear that there is a catastrophic transition at $\lambda \approx 0.41$, both for $I_n(S,R)$ and for $L$. In a second experiment, Ferrer and Sole plotted the normalized frequencies of the signals in the matrices against their rank for different values of $\lambda$. It was found that Zipf's law emerged in a small window near $\lambda = 0.41$. Taken together, this means that Zipf's law is not just a trivial outcome of a simple process, as would be suggested by the fact that it is also valid for random languages, but that it is an intrinsic part of the mathematical model of communication.

Figure 3: Relation between $I_n(S,R)$, $L$ and $\lambda$ (from [5])

Considering the graphs of figure 3, we see that animal communication is placed in the upper right, where one-to-one signal-object maps are situated. As we have already seen, this is not the case for the dolphins and squirrel monkeys of McCowan and possibly other animal systems.

4 Conclusions 

Corpus linguistics is the discipline that studies languages (or language groups) from big samples of that language (or group), with a strong emphasis on quantitative phenomena and methods. Traditionally, of course, such language samples were restricted to human languages, but with progressive research in biology and animal communication systems, corpora of non-human language-like phenomena are a distinct possibility. The sounds emitted by dolphins and collected by e.g. McCowan [12][11] clearly constitute a corpus, and so may other recordings of animal behaviour constitute corpora.

Our survey so far of animal communication centered on the mathematical qualities of the data, such as Zipfian distribution and entropy, and we sought to find these qualities in the bee dance data. Our main problem was and is therefore whether the bee dance language data can be analysed and stored in such a way that a Zipfian distribution (if present) can be detected. Partly this depends on the quantity of the data types. If the assumptions of Haldane and Spurway are correct, and if there really are 12 bits of information contained in the bee dance, this may well be the case. The second and as yet unsolved problem is the articulation of the data, i.e. the splitting into meaningful 'words', and we hope to tackle this problem in the next few months.
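
One conceivable way to approach the articulation problem, offered only as a sketch and not as a solution, is to discretise each dance into a token by binning its angle and duration. The bin sizes below are arbitrary assumptions (32 direction sectors, echoing the 5 units of direction information quoted from Haldane and Spurway, and half-second duration steps), and the token format is invented for the example.

```python
from collections import Counter


def dance_token(angle_deg, duration_s, sectors=32, duration_step_s=0.5):
    """Map a continuous (angle, duration) pair to a discrete 'word' such as 'd04-t02'."""
    sector = int((angle_deg % 360) / (360 / sectors))
    sector = min(sector, sectors - 1)          # guard against rounding at the upper edge
    duration_bin = int(duration_s / duration_step_s)
    return f"d{sector:02d}-t{duration_bin:02d}"


# Hypothetical observations: (angle in degrees, duration in seconds).
observations = [(45.0, 1.0), (46.0, 1.1), (350.0, 0.2), (0.0, 1.2)]
tokens = [dance_token(angle, duration) for angle, duration in observations]
type_counts = Counter(tokens)
```

The counts in `type_counts`, sorted by frequency, give the rank-frequency list in which a Zipfian distribution, if present, would have to show up; whether such bins correspond to meaningful 'words' is exactly the open question.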